Patient-reported outcomes (PROs) of quality of life (QoL), functioning, and depressive symptom severity are important in assessing the burden of illness of major depressive disorder (MDD) and in evaluating the impact of treatment. We sought to provide a detailed analysis of PROs before and after treatment of MDD from the large Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study. This analysis examines PROs before and after treatment in the second level of STAR*D. The complete data on QoL, functioning, and depressive symptom severity were analyzed for each STAR*D level 2 treatment. PROs of QoL, functioning, and depressive symptom severity showed substantial impairments after failing a selective serotonin reuptake inhibitor trial using citalopram (level 1). The seven therapeutic options in level 2 had a positive impact on QoL, functioning, depressive symptom severity, and calculated burden of illness that was significant both statistically (P values) and clinically (Cohen's standardized differences [Cohen's d]). There were no statistically significant differences between the interventions. However, a substantial proportion of patients still suffered from patient-reported QoL and functioning impairment after treatment, an effect that was more pronounced in nonremitters. PROs are crucial in understanding the impact of MDD and in examining the effects of treatment interventions, both in research and clinical settings.

Introduction

Major depressive disorder (MDD) is one of the leading causes of disability and premature mortality, and is predicted to become the second most burdensome condition worldwide by the year 2020. Patient-reported outcomes (PROs) are increasingly utilized to assess patients with MDD and consist of self-rating scales of symptom severity, functioning, and quality of life (QoL). PROs are being adopted, developed, and promoted by the World Health Organization (WHO) International Classification of Functioning, Disability, and Health (ICF), the US National Institutes of Health (NIH) Patient-Reported Outcomes Measurement Information System (PROMIS), the US Patient-Centered Outcomes Research Institute (PCORI), the US Federal Food and Drug Administration (FDA)-supported initiative for PROs (Critical Path Initiative [CPI]), and the UK National Health Service (NHS) Patient-Reported Outcome Measures (PROMs).

Depressive symptoms may affect patients' self-reports of symptom severity, functioning, and QoL, with their influence being shown as a mediator variable. Similarly, poor QoL, functional impairment, and more severe symptoms could also result in worsening of depression. This bidirectional relationship continues to interfere with the precision of outcome measurements. Although widely used, clinician-rated measures of symptom severity are not immune from patient bias, since they are primarily based on patient reports in addition to clinician observation, and they were even reported to yield a significant discrepancy when administered to patients vs their informants. Except in dysthymic and nonendogenous depressed groups, empirically designed self-report scales tend to have a moderately high correlation with clinician-rated ones. Moreover, PROs continue to provide valuable information that could not be obtained using clinician-rated measures, despite the risk of minimization or magnification of the actual burden of illness. In fact, QoL by conceptual and operational definitions has to be measured by subjective reporting. Based on the WHO definitions, QoL reflects the patient's satisfaction with health and life activities, ie, work, love, and play activities by self-report, whereas functioning refers to an individual's actual involvement and participation in the aforementioned activities as rated by self or observers. Unless clinicians are using collateral information, functioning is primarily measured by self-rating. Individuals with MDD frequently suffer from QoL and functioning impairments, and several investigators have demonstrated that treatment of severe mental disorders, including MDD, should not only focus on reduction of symptoms, but also seek to enhance levels of functioning and, more significantly, improve the patient's subjective wellbeing and QoL. Since patients remain at the center of suffering from depression, their perceptions of the dimensions of their burden of illness using PROs should remain fundamental tools for assessing the effectiveness of treatment interventions, especially in the era of patient-centered health care. Descriptions of the most commonly used PROs in MDD appear in Table I.

In order to present a real-world application of PROs, we sought to provide a detailed analysis of PROs before and after treatment of MDD using different interventions following the failure of first-line treatment with a selective serotonin reuptake inhibitor (SSRI). In earlier studies, we examined the effect of level 1 treatment on functioning, QoL, and the individual burden of illness for depression (IBI-D), a vector derived from a principal component analysis that captures the vast majority of the variance in PROs of depressive symptom severity, functioning, and QoL in depressed patients. The present analysis aims at examining PROs in patients who failed first-line treatment with the SSRI citalopram, before and after enrolment in level 2, in which seven interventions were used as second-line treatments for MDD. Although many publications have reported findings from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study, patient-reported functional outcomes were seldom described. Moreover, previous publications have traditionally focused on results based on P values, whereas the current analysis adds the examination of effect sizes in order to assess more clinically meaningful effects of the interventions as assessed using patient-reported measures.

Methods 

Study population 

STAR*D is the largest study ever conducted on MDD treatment, and featured multiple levels of pharmacotherapy and psychotherapy trials. Patients failing one trial of SSRI monotherapy (level 1) were either switched from citalopram to sertraline, venlafaxine, bupropion, or cognitive therapy, or kept on citalopram and augmented with bupropion, buspirone, or cognitive therapy (level 2). The STAR*D study was funded by the National Institute of Mental Health (NIMH) and is the largest study aimed at analyzing subsequent treatment steps for patients with treatment-resistant MDD. The design and rationale of STAR*D are detailed elsewhere. The study enrolled 4041 18- to 75-year-old outpatients with nonpsychotic MDD from 18 primary care and 23 psychiatric care practice settings across the United States from 2001 to 2007. There are up to four sequential treatment levels, but this study focuses only on level 2, which includes seven treatment options. Patients from level 1 who were unable to tolerate citalopram, as well as those who failed to show adequate improvement, were offered three augmentation options (adding sustained-release bupropion, buspirone, or cognitive therapy to the level 1 citalopram) and four switch options (replacing citalopram with sertraline, sustained-release bupropion, extended-release venlafaxine, or cognitive therapy). The authors obtained an NIMH data use certificate to access and use the STAR*D Pub Ver1 dataset for this analysis. The complete data on QoL, functioning, and depressive symptom severity were analyzed for each level 2 treatment.

| Name | Time to complete | Number of items | Item scale | Score range | Higher score is | Summary |
|---|---|---|---|---|---|---|
| QoL patient-reported outcomes | | | | | | |
| Q-LES-Q: Quality of Life, Enjoyment, and Satisfaction Questionnaire-Short Form | 5 min | 16 | 1-5 | 0-100 | Better | The Q-LES-Q assesses QoL covering the following domains: health, mood, work, household activities, social and family relationships, leisure, ability to function daily, sex, economic and living situation, mobility, vision, wellbeing, medications, and overall satisfaction. The total score is calculated as the sum of scores from items 1 through 14 and is converted to a percentage using the following calculation: ([raw score-14]/56) x 100. Community norm scores had a mean of 78.3%. |
| SF-36: Medical Outcomes Study Short Form 36 | 15 min | 36 | 0-5 | 0-100 | Better | The SF-36 and its brief form, the SF-12, measure QoL on eight health concepts: 1. Limitations in physical activities because of health problems. 2. Limitations in social activities because of physical or emotional problems. 3. Limitations in usual role activities because of physical health problems. 4. Bodily pain. 5. General mental health (psychological distress and well-being). |
| SF-12: Medical Outcomes Study Short Form 12 | 5 min | 12 | | | | |
| WHOQOL: WHO Quality of Life | 25 min | 100 | 1-5 | 0-100 | Better | The WHOQOL and its brief form, the WHOQOL-BREF, are focused around the definition of QoL advocated by WHO; this includes the culture and context that influence an individual's perception of health. They measure four domains: physical health, psychological health, social relationships, and environment. |
| WHOQOL-BREF: 26-item version | 10 min | 26 | | | | |
| EQ-5D: EuroQoL | 3 min | 5 | 1-3 | -1.0 to 1.0 | Better | The EQ-5D measures QoL using five single-item measures of mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Scores range from 1.0 (perfect health) to -1.0 (death). It has an additional visual analog scale ranging from 0 (worst imaginable health state) to 100 (best imaginable health state). |
| PROMIS-GHS: Patient-Reported Outcomes Measurement Information System Global Health Scale | 5 min | 10 | 0-3 | 0-100 | Better | The PROMIS-GHS measures health and QoL by assessing five primary domains: physical function, fatigue, pain, emotional distress, and social health. Scoring results in a "physical health" component and a "mental health" component, each with a mean of 50 (SD, 10), where higher or lower scores indicate better or worse health than the population. |
| Functioning patient-reported outcomes | | | | | | |
| WSAS: Work and Social Adjustment Scale | 3 min | 5 | 0-8 | 0-40 | Worse | The WSAS measures functioning in the work, home management, private leisure, social leisure, and relationship domains. The sum of the scores produces a total score where a score >20 indicates major functional impairment, 10-20 indicates significant functional impairment, and scores <10 are within normal range. |
| WHODAS 2.0: WHO Disability Assessment Schedule 2.0 | 15 min | 36 | 0-3 | 0-100 | Worse | The WHODAS 2.0 and its brief 12-item version measure functioning in: cognition (understanding and communicating); mobility (moving and getting around); self-care (hygiene, dressing, eating, and staying alone); getting along (interacting with other people); life activities (domestic responsibilities, leisure, work, and school); and participation (joining in community activities). Scoring utilizes one of two methods: simple scoring involves a simple sum of the scores, and complex scoring uses a script converting the score using item-response theory to a range from 0 (no disability) to 100 (total disability). |
| WHODAS 2.0: 12-item version | 5 min | 12 | | | | |
| SDS: Sheehan Disability Scale | 3 min | 3 | 0-10 | 0-30 | Worse | The SDS assesses functioning in the domains of work, social life, and family life/home responsibilities. The sum of the scores leads to a total score ranging from 0 (unimpaired) to 30 (highly impaired). Scores >5 on any of the domains or a total score >8 are indicative of functional impairment. |
| EWPS: Endicott Work Productivity Scale | 10 min | 25 | 0-4 | 0-100 | Worse | The EWPS covers twenty-five aspects of work/job functions such as being on time, accomplishing tasks, and performance. The item scores are summed to a total score that ranges from 0 (no impairment) to 100 (major impairment in work productivity). |
| Depressive symptom severity patient-reported outcomes | | | | | | |
| QIDS-SR: Quick Inventory of Depressive Symptomatology-Self Report | 10 min | 16 | 0-3 | 0-27 | Worse | The QIDS-SR measures the severity of 16 depressive symptoms. The total score is a sum of the highest score on any one of four sleep items (1-4) + item (5) + the highest score on any one appetite/weight item (6-9) + items (10-14) + the highest score on either of the two psychomotor items (15 and 16). Severity of MDD depressive symptoms is categorized based on the QIDS-SR scores: 0-5 (remission), 6-10 (mild), 11-15 (moderate), 16-20 (severe), or >20 (very severe). |
| BDI-II: Beck Depression Inventory II | 10 min | 21 | 0-3 | 0-63 | Worse | The BDI-II measures the severity of 21 depressive symptoms. The total score is the sum of all items. Depression severity is categorized with scores of 0-13 (minimal depression), 14-19 (mild depression), 20-28 (moderate depression), 29-63 (severe depression). |
| CUDOS: Clinically Useful Depression Outcome Scale | 5 min | 18 | 0-4 | 0-64 | Worse | The CUDOS rates 16 DSM-IV depression symptoms from "not at all true" (0 days) to "almost always true" (every day), with item 17 rating interference with functioning and item 18 rating quality of life. The total score is the sum of the first 16 items, ranging from 0-10 (nondepressed), 11-20 (minimal depression), 21-30 (mild depression), 31-45 (moderate depression), or 46 and above (severe depression). |
| CES-D: Center for Epidemiologic Studies Depression Scale | 10 min | 20 | 0-3 | 0-60 | Worse | The CES-D measures the severity of 20 depressive symptoms from "rarely" to "most of the time". The score is the sum of the 20 questions. A score of 16 points or more is considered as "depressed". |

Table I. Patient-reported outcomes of quality of life (QoL), functioning, and depressive symptom severity. DSM-IV, Diagnostic and Statistical Manual of Mental Disorders-IV; MDD, major depressive disorder; SD, standard deviation; WHO, World Health Organization.

Outcome measures 

Quality of Life, Enjoyment, and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF)

The Q-LES-Q-SF, used to assess QoL, is a self-report questionnaire that measures satisfaction and enjoyment in a series of discrete domains and life activities. This study uses the short version, which has 16 items. The first 14 items cover physical health, mood, work, household activities, social relationships, family relationships, leisure time activities, ability to function in daily life, sexual drive/interest/performance, economic status, living/housing situation, ability to get around physically, vision, and overall sense of wellbeing; the remaining two items cover medications and overall satisfaction (Table I). Each item is scored on a 5-point Likert scale, where 1=very poor, 2=poor, 3=fair, 4=good, and 5=very good. The total score is obtained by summing the first 14 items, subtracting the minimum possible raw score (14), dividing by the possible raw score range (56), and multiplying by 100, which gives a total score ranging from 0 to 100, with 0 being the lowest QoL score and 100 the highest. Community norm samples have a mean Q-LES-Q-SF score of 78.3 (standard deviation [SD], 11.3), and scores within 10% of this value, ie, Q-LES-Q-SF >70.47, are considered "normal", whereas Q-LES-Q-SF scores greater than 2 SD below the community norm scores, ie, Q-LES-Q-SF <55.7, are considered "severely impaired." The Q-LES-Q-SF has a Cronbach α of 0.90 and test-retest reliability of 0.74, demonstrating strong psychometric properties.
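As a worked illustration of the scoring rule, the following Python sketch applies the conversion given in Table I, ([raw score-14]/56) x 100, to the first 14 item ratings and compares the result with the community-norm cutoffs stated above. The function names and the "intermediate" label for scores lying between the two cutoffs are illustrative assumptions; they are not part of the instrument or of the STAR*D analysis code.

```python
COMMUNITY_MEAN = 78.3  # community norm mean reported in the text
COMMUNITY_SD = 11.3    # community norm SD reported in the text


def qlesq_sf_percent(items):
    """Convert the first 14 Q-LES-Q-SF ratings (each 1-5) to a 0-100 score."""
    if len(items) != 14 or any(not 1 <= rating <= 5 for rating in items):
        raise ValueError("expected 14 ratings, each between 1 and 5")
    raw = sum(items)               # raw score ranges from 14 to 70
    return (raw - 14) / 56 * 100   # conversion given in Table I


def qlesq_sf_category(score):
    """Classify a 0-100 score against the cutoffs quoted in the text."""
    normal_cutoff = 0.9 * COMMUNITY_MEAN                # 70.47
    severe_cutoff = COMMUNITY_MEAN - 2 * COMMUNITY_SD   # 55.7
    if score > normal_cutoff:
        return "normal"
    if score < severe_cutoff:
        return "severely impaired"
    return "intermediate"  # hypothetical label, not used in the source


# Example: a patient rating every one of the 14 items as "fair" (3)
example_score = qlesq_sf_percent([3] * 14)
example_category = qlesq_sf_category(example_score)
```

A patient who rates every item "fair" therefore scores 50, which falls below the 55.7 cutoff for severe impairment.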

Work and Social Adjustment Scale (WSAS)

The WSAS was used to assess functioning. The WSAS is a five-item self-report questionnaire that measures impairment of functioning in various settings: work; home management, eg, cleaning, tidying, shopping, cooking, looking after home or children, and paying bills; social leisure time, ie, activities done with other people, such as parties, clubs, outings, dating, and home entertainment; private leisure time, ie, activities done alone, such as reading, gardening, collecting, sewing, and walking alone; and ability to form and maintain close relationships with others. Each item is scored on a visual analogue scale ranging from 0 (no impairment at all) to 8 (very severe impairment). The sum of the results of the five items gives a total score ranging from 0 (best functioning) to 40 (worst functioning). Severe impairment is indicated by scores above 20. The WSAS has a Cronbach α varying from 0.70 to 0.94 and test-retest reliability of 0.73, also demonstrating strong psychometric properties.
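The WSAS total and the bands used in this analysis (scores below 10 within normal range, 10 to 20 significant impairment, above 20 severe impairment, as listed in Table I) can be sketched as follows. The function names are illustrative assumptions rather than part of any published scoring software.

```python
def wsas_total(items):
    """Sum the five WSAS item ratings (each 0-8) into a 0-40 total."""
    if len(items) != 5 or any(not 0 <= rating <= 8 for rating in items):
        raise ValueError("expected 5 ratings, each between 0 and 8")
    return sum(items)


def wsas_category(total):
    """Map a WSAS total onto the bands listed in Table I."""
    if total > 20:
        return "severe impairment"
    if total >= 10:
        return "significant impairment"
    return "within normal range"


# Example: ratings for work, home management, social leisure,
# private leisure, and close relationships
example_total = wsas_total([4, 5, 5, 5, 5])
example_band = wsas_category(example_total)
```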



| Name | Time to complete | Number of items | Item scale | Score range | Higher score is | Summary |
|---|---|---|---|---|---|---|
| PHQ-9: Patient Health Questionnaire | 5 min | 9 | 0-3 | 0-27 | Worse | The PHQ-9 measures the nine depressive symptoms from the DSM-IV. The total score is the sum of the nine items with scores of 1-4 (minimal depression), 5-9 (mild depression), 10-14 (moderate depression), 15-19 (moderately severe depression), or 20-27 (severe depression). |
| PROMIS Depression: Patient-Reported Outcomes Measurement Information System Depression Scale | 5 min | 8 | 1-5 | 0-100 | Better | The PROMIS Depression scale measures negative mood, view of self, social cognition, decreased positive affect, and engagement. The raw score is then converted to a T score that has a population mean of 50 (SD, 10). |

Table I. Continued

Quick Inventory of Depressive Symptomatology-Self Report (QIDS-SR)

The QIDS-SR was used to assess severity of depressive symptoms. The QIDS-SR is a 16-item questionnaire corresponding to the nine DSM-IV criteria of major depression: one item for each of the following symptoms: depressed mood, decreased interest, decreased energy, worthlessness/guilt, concentration/decision making, and suicidal ideation; four items to assess sleep: early, middle, and late insomnia, and hypersomnia; two items to assess psychomotor disturbance: agitation and retardation; and four items to assess appetite/weight: appetite increase or decrease, and weight increase or decrease. Each item is rated 0 to 3. The QIDS-SR score is calculated by summing the scores of the nine domains; in domains utilizing more than one item (eg, the four items for sleep disturbance), only the highest item score is counted in the total score. The QIDS-SR scores range from 0 (not depressed) to 27 (most severely depressed). A score of five or less indicates remission, which is the goal of treatment. The QIDS-SR has high internal consistency (Cronbach α, 0.86) and is highly associated with the three versions of the clinician-rated Hamilton Rating Scale for Depression, the Montgomery-Åsberg Depression Rating Scale, and the Beck Depression Inventory.
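The domain-wise scoring rule can be made concrete with a short Python sketch. It assumes the item order given in the Table I summary (items 1-4 sleep, item 5, items 6-9 appetite/weight, items 10-14, items 15-16 psychomotor) and uses the severity bands listed there; the function names are illustrative assumptions.

```python
def qids_sr_total(items):
    """Total QIDS-SR score from 16 ratings (each 0-3), item 1 first."""
    if len(items) != 16 or any(not 0 <= rating <= 3 for rating in items):
        raise ValueError("expected 16 ratings, each between 0 and 3")
    sleep = max(items[0:4])              # highest of sleep items 1-4
    item_five = items[4]                 # item 5 counts on its own
    appetite_weight = max(items[5:9])    # highest of items 6-9
    single_items = sum(items[9:14])      # items 10-14 each count
    psychomotor = max(items[14:16])      # highest of items 15 and 16
    return sleep + item_five + appetite_weight + single_items + psychomotor


def qids_sr_severity(total):
    """Severity band for a QIDS-SR total, following Table I."""
    if total <= 5:
        return "remission"
    if total <= 10:
        return "mild"
    if total <= 15:
        return "moderate"
    if total <= 20:
        return "severe"
    return "very severe"


# Example ratings, grouped by domain for readability
example_items = [1, 2, 0, 0,  2,  0, 1, 0, 0,  1, 1, 1, 1, 1,  0, 2]
example_total = qids_sr_total(example_items)
example_severity = qids_sr_severity(example_total)
```

Because only the highest item in each multi-item domain is counted, the maximum total is 27 rather than 48.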

Individual Burden of Illness for Depression (IBI-D) 

The burden of illness was measured using the IBI-D, a newly introduced measure that incorporates QoL, functioning, and depressive symptom severity. The IBI-D is the first and only statistically significant principal component obtained from a principal component analysis of the above three well-validated PROs of depressive symptom severity (QIDS-SR), functioning (WSAS), and QoL (Q-LES-Q-SF). IBI-D is a z-score that references patients in level 1 of STAR*D, where values around 0 represent a burden similar to that of the average depressed patient, values greater than +2 indicate an unusually high burden (within roughly the top 2% of depressed patients), and values lower than -2 indicate an unusually low burden (lower than that of roughly 98% of depressed patients). The IBI-D was shown to adequately capture the multidimensional impact of antidepressant treatment, and to adequately predict relapse in MDD.
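The general idea behind such an index, a first principal component of the three standardized PROs expressed as a z-score, can be sketched in plain Python. This is only an illustration under stated assumptions: the data are synthetic, the loadings are estimated from that synthetic sample rather than taken from the published IBI-D, the z-scores reference the synthetic sample rather than STAR*D level 1, and power iteration is used here simply as a dependency-free way to obtain the leading eigenvector.

```python
import math
import random


def standardize(values):
    """Return z-scores of a list using the sample standard deviation."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]


def first_component(columns, iterations=200):
    """Leading eigenvector and eigenvalue of the correlation matrix."""
    z = [standardize(col) for col in columns]
    n, k = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    vec = [1.0] + [0.0] * (k - 1)  # start on the first variable (symptoms)
    for _ in range(iterations):    # power iteration
        nxt = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * corr[i][j] * vec[j]
                     for i in range(k) for j in range(k))
    return z, vec, eigenvalue


def burden_scores(qids, wsas, qlesq):
    """Z-scored first-component scores, loadings, and variance explained."""
    z, loadings, eigenvalue = first_component([qids, wsas, qlesq])
    projections = [sum(loadings[j] * z[j][i] for j in range(3))
                   for i in range(len(qids))]
    return standardize(projections), loadings, eigenvalue / 3


# Synthetic sample (hypothetical values, not STAR*D data)
rng = random.Random(0)
severity = [rng.gauss(0, 1) for _ in range(500)]
qids = [14 + 4 * s + rng.gauss(0, 2) for s in severity]
wsas = [23 + 8 * s + rng.gauss(0, 4) for s in severity]
qlesq = [42 - 13 * s + rng.gauss(0, 6.5) for s in severity]
scores, loadings, explained = burden_scores(qids, wsas, qlesq)
```

In this sketch higher scores mean greater burden: symptom severity and functional impairment load positively, and QoL loads negatively.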



Statistical methods 

Summary values are expressed as means and SDs for continuous variables, and frequencies (%) for categorical variables. We calculated effect sizes using Cohen's standardized differences (Cohen's d) in order to assess clinical significance in addition to statistical significance. Cohen's d values of 0.2, 0.5, and 0.8 describe small, medium, and large effects, respectively. Within the treatment groups, changes from entry to exit (Δs) on continuous variables were assessed for significance using paired t tests. Between-group differences on continuous variables were assessed for significance using independent-sample t tests. We calculated and compared the proportions of patients with normal QoL (using Q-LES-Q-SF) and functioning (using WSAS) and with severe impairments on both measures. Within the treatment groups, preintervention vs postintervention P values were calculated using McNemar's test for related proportions. Five tests were performed for each outcome measure: two within-group tests and three between-group tests (entry, change, and exit). Thus, we used a Bonferroni-adjusted 0.01 significance level for each test. Analyses were performed using SAS software version 9.2 (SAS Institute Inc, Cary, NC, USA).
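The effect-size and multiplicity calculations can be illustrated with a brief Python sketch. Two points are assumptions rather than statements from the source: the standardizer for Cohen's d is taken to be the pooled SD of entry and exit scores (the text does not say which was used), and the family-wise level is taken to be 0.05, which is consistent with a 0.01 level across five tests. The scores in the example are hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d(pre, post):
    """Standardized entry-to-exit difference using the pooled SD (assumed)."""
    pooled_sd = math.sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2)
    return (mean(post) - mean(pre)) / pooled_sd


def effect_label(d):
    """Conventional label for the magnitude of Cohen's d."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # hypothetical label for values below 0.2


def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return family_alpha / n_tests


# Hypothetical symptom scores at entry and exit (lower is better)
entry = [10, 12, 14, 16, 18]
exit_scores = [8, 10, 12, 14, 16]
example_d = cohens_d(entry, exit_scores)
per_test_alpha = bonferroni_alpha(0.05, 5)
```

The sign of d simply reflects the direction of the scale; for symptom and impairment scales a negative change is an improvement, so magnitudes are compared in absolute value.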

Results 

STAR*D used an equipoise stratified randomized design. In level 2, patients, with the help of their clinicians, considered either switching or augmenting citalopram using a new medication or cognitive therapy. Patients could decline any strategy; however, at least two strategies had to remain acceptable, and the patient was randomized to one of them.

Study population demographics 

Complete QoL, functioning, and depressive symptom 
severity data from STAR*D level 2 trial subjects were 
analyzed (n=749). The mean age was 44.4 years (SD, 
12.4) with a range from 18.8 to 75 years. Caucasians rep- 
resented 81.6% (n=611), while Hispanics accounted for 
12.3% (n=92) in this sample. Females made up 60.1% 
(n=450) and 25.1% (n=188) were college graduates. 
Slightly more than half, 51.9% (n=389), were employed 
at the start of the study and 42.1% (n=330) were living 
with a spouse or partner.

QoL, functioning, and depressive symptom severity scores

Level 2 STAR*D entry and exit scores for QoL (Q-LES-Q-SF), functioning (WSAS), and depressive symptom severity (QIDS-SR) were analyzed for each of the seven treatment options separately to assess each option's effect in MDD.

Pretreatment scores 

The Q-LES-Q-SF scores ranged from 38.5 to 45.2 with a 
mean score of 42.1 (SD, 15.5). The WSAS scores ranged 
from 22.4 to 24.1 with a mean score of 23.6 (SD, 9.0). 
The QIDS-SR scores ranged from 13.1 to 15.7 with a 
mean score of 14.3 (SD, 4.7). No statistically significant 
differences were found between the seven treatment 
groups. 

Post-treatment scores: impact of treatment on QoL, functioning, and depressive symptoms

All seven treatment options led to significant improvements in patients, both statistically (P<0.01) and clinically (Cohen's d, >0.4). The treatment with the highest effect size for QoL, as measured by Q-LES-Q-SF, was switching to cognitive therapy (Cohen's d, 0.73), followed by switching to sertraline (Cohen's d, 0.67). The lowest effect sizes were for switching to bupropion and augmenting citalopram with cognitive therapy (Cohen's d, 0.42). For functioning, as measured by WSAS, the highest effect size was for switching to cognitive therapy (Cohen's d, 0.78), followed by sertraline, venlafaxine, and citalopram plus bupropion (Cohen's d, 0.62), and the lowest was for bupropion alone (Cohen's d, 0.47). The highest effect sizes for depressive symptom severity, as measured by QIDS-SR, were for venlafaxine (Cohen's d, 0.88) and cognitive therapy (Cohen's d, 0.83), with the lowest being citalopram plus buspirone (Cohen's d, 0.48). It is especially important to note the high effect sizes for venlafaxine and cognitive therapy (Cohen's d, >0.8), indicating large, clinically significant improvements. In general, the effect sizes were higher for patient-reported depressive symptom severity than for QoL or functioning.

Proportions of patients scoring within normal QoL and functioning

Level 2 STAR*D entry and exit proportions of patients with normal QoL (Q-LES-Q-SF >78.3) and functioning are presented in Table II.

Pretreatment proportions 

Before treatment, depressed patients scoring for a 
normal QoL ranged from 0% to 5%, with a mean pro- 
portion of 2.9%. Patients within normal functioning 
ranged from 1.7% to 11.5%, with a mean proportion 
of 6.4%. 

Post-treatment proportions: impact of treatment

The post-treatment data revealed a statistically significant increase in the proportion of patients with normal QoL and functioning. All P values were less than 0.0005 for QoL and functioning, except for citalopram plus cognitive therapy (QoL, P=0.63; functioning, P=0.012). However, the proportion of patients with normal scores upon exit ranged from 6.4% to 31.3% for QoL (Q-LES-Q-SF) and from 21.3% to 43.8% for functioning (WSAS), with a mean of 19.5% of patients having normal QoL and 31.9% having normal functioning. In other words, following treatment, only 1 out of 5 patients achieved a normal QoL and less than one third of patients achieved normal functioning.
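McNemar's test for related proportions depends only on the discordant pairs, ie, patients whose status changed between entry and exit. The Python sketch below uses the exact binomial form of the test; whether the original SAS analysis used the exact or the chi-square form is not stated in the source, and the discordant counts shown are hypothetical because the source reports only marginal percentages.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from discordant pair counts.

    b: patients normal at entry but not at exit
    c: patients not normal at entry but normal at exit
    """
    n = b + c
    if n == 0:
        return 1.0  # no patient changed status, so no evidence of change
    k = min(b, c)
    # Probability of k or fewer "successes" in n trials with p = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for a small treatment group
example_p = mcnemar_exact(1, 3)
# A group in which 10 patients improved and none worsened
strong_p = mcnemar_exact(0, 10)
```

With few discordant pairs, as in the smaller treatment groups, even a visible change in proportions can fail to reach the Bonferroni-adjusted 0.01 level.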

Proportions of patients with severe impairments in 
QoL and functioning 

Level 2 STAR*D entry and exit proportions of patients with severe impairments in QoL (more than 2 SD below community norms, ie, Q-LES-Q-SF <55.7) and functioning (WSAS >20) are displayed in Table II.

Pretreatment proportions 

Before treatment, severe impairments in QoL and functioning were detected in a large proportion of patients, with an overall higher percentage of patients with severely impaired QoL (range, 77.2% to 93.6%; mean, 83.3%) than functioning (range, 57.7% to 66.1%; mean, 62.5%), ie, more than 4 out of 5 patients suffered from severe QoL impairment and nearly 2 out of 3 patients suffered from severe functional impairment.

Post-treatment proportions: impact of treatment

The post-treatment data revealed a statistically significant reduction in the proportion of patients suffering from severely impaired QoL (P<0.005). Several of the functional impairment results were not statistically significant at the Bonferroni-adjusted 0.01 significance level. The proportion of patients severely impaired upon exit ranged from 43.8% to 68.6% for QoL (mean for the whole sample, 59.5%) and from 37.5% to 49.6% for functioning (mean for the whole sample, 44.3%). In other words, nearly 3 out of 5 patients still struggled with severe impairments in QoL, and less than half still experienced severe impairments in functioning. Nonremitters had substantially smaller proportions of patients within normal QoL and functioning, and larger proportions with severe impairments in QoL and functioning at exit.

| Intervention | n | Within normal QoL, pre (%) | Within normal QoL, post (%) | McNemar test P value | Severely impaired QoL, pre (%) | Severely impaired QoL, post (%) | McNemar test P value |
|---|---|---|---|---|---|---|---|
| Bupropion | 121 | 5.0 | 20.7 | 0.0002 | 84.3 | 68.6 | 0.002 |
| Sertraline | 131 | 0 | 17.6 | <0.0001 | 89.1 | 62.6 | <0.0001 |
| Venlafaxine | 121 | 1.7 | 14.0 | 0.0003 | 89.3 | 61.2 | <0.0001 |
| Cognitive therapy | 32 | 3.1 | 31.3 | 0.012 | 84.4 | 43.8 | 0.001 |
| Citalopram + bupropion | 148 | 4.1 | 23.6 | <0.0001 | 75.0 | 53.4 | <0.0001 |
| Citalopram + buspirone | 149 | 4.0 | 22.1 | <0.0001 | 77.2 | 55.0 | <0.0001 |
| Citalopram + cognitive therapy | 47 | 2.1 | 6.4 | 0.63 | 93.6 | 68.1 | 0.004 |
| All | 749 | 2.9 | 19.5 | <0.0001 | 83.3 | 59.5 | <0.0001 |

| Intervention | n | Within normal functioning, pre (%) | Within normal functioning, post (%) | McNemar test P value | Severely impaired functioning, pre (%) | Severely impaired functioning, post (%) | McNemar test P value |
|---|---|---|---|---|---|---|---|
| Bupropion | 121 | 6.6 | 25.6 | <0.0001 | 64.5 | 47.9 | 0.0008 |
| Sertraline | 131 | 5.3 | 31.3 | <0.0001 | 65.6 | 49.6 | 0.002 |
| Venlafaxine | 121 | 1.7 | 28.1 | <0.0001 | 66.1 | 45.5 | 0.0001 |
| Cognitive therapy | 32 | 3.1 | 43.8 | 0.0002 | 65.6 | 37.5 | 0.012 |
| Citalopram + bupropion | 148 | 11.5 | 39.2 | <0.0001 | 58.8 | 41.9 | 0.0003 |
| Citalopram + buspirone | 149 | 8.1 | 34.2 | <0.0001 | 57.7 | 39.6 | <0.0001 |
| Citalopram + cognitive therapy | 47 | 2.1 | 21.3 | 0.012 | 66.0 | 44.7 | 0.031 |
| All | 749 | 6.4 | 31.9 | <0.0001 | 62.6 | 44.3 | <0.0001 |

Table II. Proportions of patients with normal quality of life (QoL) and functioning before and after each intervention. Normal QoL is defined as Q-LES-Q-SF scores within 10% of community norms, and severe impairment is defined as Q-LES-Q-SF scores greater than 2 standard deviations (SD) below the community norms. Since community norm samples have an average Q-LES-Q-SF of 78.3 (SD, 11.3), a Q-LES-Q-SF >70.47 is considered within normal and a Q-LES-Q-SF <55.7 is considered severely impaired. Normal functioning is defined as WSAS scores of less than 10 and severe impairment is defined as WSAS scores of more than 20. n, number.

PRO in major depressive disorder - IsHak et al
Dialogues in Clinical Neuroscience - Vol 16 ■ No. 2 ■ 2014

Individual burden of illness for depression (IBI-D) scores

Level 2 STAR*D entry and exit scores for burden of illness for depression (IBI-D) are displayed in Table III.

Pretreatment scores

Generally, the baseline IBI-D scores of remitters were lower than those of nonremitters, suggesting that remitters started out with less burden of illness overall. For remitters (P<0.0001), the treatment group with the most burden at baseline was cognitive therapy (z score, -0.13), while the treatment group with the least burden was citalopram plus bupropion (z score, -0.77) followed by citalopram plus buspirone (z score, -0.76). For nonremitters (all P values, <0.0001), the treatment group with the most burden was sertraline (z score, 0.25) followed by venlafaxine (z score, 0.24), while the treatment group with the least was citalopram plus cognitive therapy (z score, -0.10) followed by cognitive therapy (z score, -0.11) and citalopram plus bupropion (z score, -0.11).

Post-treatment scores: impact of treatment 

Posttreatment, the data revealed an overall statistically significant reduction in the burden
of illness for depression (all P values, <0.0001). For both remitters and nonremitters (all P
values, <0.0001), the treatment group with the least burden upon exit was cognitive therapy
(change for IBI-D scores in remitters from -0.13 to -2.68; nonremitters from -0.11 to -0.50).
Additionally, the treatments that led to the greatest decrease in burden (all P values, <0.0001)
were cognitive therapy (change for IBI-D scores in remitters, -2.55; nonremitters, -0.39),
sertraline (change for IBI-D scores in remitters, -1.96; nonremitters, -0.42), and venlafaxine
(change for IBI-D scores in remitters, -1.90; nonremitters, -0.49).
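
The change scores quoted in this paragraph are exit minus baseline, so a negative value means a lower burden at exit. The sketch below recomputes them from the rounded group means given in the text for cognitive therapy. The variable and function names are illustrative assumptions. A change computed from rounded means can differ from a published change by 0.01, because the published value would be averaged from unrounded patient-level data.

```python
# (pre, post) mean IBI-D z scores quoted in the text for cognitive therapy.
cognitive_therapy = {
    "remitters": (-0.13, -2.68),
    "nonremitters": (-0.11, -0.50),
}


def ibi_d_change(pre: float, post: float) -> float:
    """Exit minus baseline, rounded to the two decimals used in the text."""
    return round(post - pre, 2)


changes = {
    group: ibi_d_change(pre, post)
    for group, (pre, post) in cognitive_therapy.items()
}
```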

Differences between the seven interventions on PROs 

There were no statistically significant differences between the interventions. Interestingly,
switching to cognitive therapy stood out numerically with the greatest effect size for QoL
(Cohen's d, 0.73), the largest proportion of patients with normal QoL and functioning upon exit
(31.3%), the lowest proportion of patients severely impaired (43.8%), as well as the lowest
IBI-D score upon exit (z score, -0.50).

Discussion 

The main findings of the present study are: (i) PROs show that the seven level 2 treatment
options produced functional improvements that were significant both statistically (P<0.01) and
clinically (Cohen's d>0.4); and (ii) patient-reported functional outcomes revealed that a
substantial proportion of patients who had failed a first-line trial with citalopram still
experienced grave impairments in QoL and functioning after treatment with second-line
augmentation or switching interventions, an effect that is more pronounced in nonremitters.



The seven different treatments of MDD in patients who have failed initial citalopram
monotherapy show a significant positive impact on QoL, functioning, and depressive symptom
severity using patient-reported measures. The use of effect size enabled us to assess the
magnitude of this impact after ascertaining that it did not happen by chance, ie, after
establishing statistical significance. On depressive symptom severity, other STAR*D analyses
concluded that bupropion, sertraline, venlafaxine, and cognitive therapy, as well as citalopram
plus bupropion, citalopram plus buspirone, and citalopram plus cognitive therapy, lead to
similar outcomes. However, this study is unique in including patient-reported functional
outcomes. In general, the effect sizes were the highest for patient-reported depressive symptom
severity, followed by functioning, then QoL. Switching to cognitive therapy alone, after failing
first-line SSRI treatment, achieved numerical superiority, a finding that might be worthy of
future exploration. Although the usefulness of cognitive therapy in refractory MDD has long been
established with respect to reducing symptoms, the present study showed that the cognitive
therapy group displayed the largest proportion of patients with normal QoL and functioning upon
exit, the smallest proportion of patients severely impaired, the greatest effect size for both
QoL and functioning, and the lowest IBI-D score upon exit. Previous STAR*D analyses utilized
remission (not patient-reported QoL or functioning) as the primary outcome to show that
cognitive therapy alone, or in addition to citalopram, was as effective as the other level 2
pharmacologic options with "comparable outcomes." The above results for cognitive therapy need
to be interpreted with extreme caution; the results should only be considered as
hypothesis-generating, since STAR*D was not designed to compare it with other treatments and the
sample size was small.

Although the impact of the seven interventions is statistically and clinically significant, a
substantial proportion of patients failed to achieve normal scores on QoL and functional
patient-reported measures. Nonremitters showed remarkably large proportions with severe
impairments in QoL and functioning, and only very small proportions of nonremitters experienced
normal QoL and functioning. Moreover, our analysis shows that, even after achieving remission,
only half of the remitters had normal QoL scores. This finding adds more credence to the notion
that remission (minimal or no symptoms) does not reflect full recovery of functional outcomes,
and perhaps more ways to improve QoL and functioning will need to be researched and applied in
clinical settings.

Strengths and limitations 

Our study suffers from a number of limitations, related both to our own analysis and to the
STAR*D study design. It is important to note the original STAR*D study was not designed to
accommodate comparison of level 2 interventions. It could be speculated that the lack of
equipoise in patient randomization affected statistical power and, down the line, clinical
significance. It would be useful to design another experiment specifically for this purpose,
with a more balanced distribution of patients. Additionally, it would be informative to compare
cognitive therapy to other forms of psychotherapy that have already proven their usefulness in
treating MDD, such as interpersonal psychotherapy.

STAR*D lacked a control or a placebo group, which could have provided useful comparative data
and helped control for the placebo effect and the passage of time as factors of remission.
However, this may be of less significance for level 2 than for level 1 data. No blinding was
required for the physicians and the patients involved. There is also a dearth of information
regarding dropouts. Warden et al have demonstrated that African-Americans, young patients,
individuals with lower education, and patients with lower income were more likely to drop out of
the STAR*D study than other groups. The vast majority (90%) of cases of attrition in level 2 of
the STAR*D study were shown to be motivated by nonmedical reasons. This finding might explain
why no difference was found in the percentage of patients that exited the study in the drug
group compared with the cognitive therapy group, which carried no drug side effects. Medical
predictors of attrition included medication side effects and the presence of Diagnostic and
Statistical Manual of Mental Disorders (DSM) axis I comorbidities. Attrition makes it more
difficult to extrapolate conclusions from the sample studied to the general population; it is
therefore important for future studies to account for missing data from attrition to avoid
selection bias. Another potential weakness in the present analysis is the lack of control for
coexisting symptoms, such as anxiety, insomnia, and loss of energy. Gaynes et al have suggested
that, while the latter do affect remission rates, symptoms such as loss of energy may guide
medication selection according to side-effect profile.

One of the most important strengths of the present study is its examination of effect sizes.
While STAR*D studies have traditionally analyzed results using P values, the current analysis
uses Cohen's d in order to compare the relative strengths of the seven interventions. This
estimation of magnitude serves to complement the statistical inference supplied by P values.
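
To illustrate how an effect size complements a significance test, the sketch below computes both from the same paired pre/post scores. It assumes one common paired-samples form of Cohen's d (mean change divided by the SD of the change scores); the text does not state which variant was used, so this may differ from the authors' calculation. The scores are hypothetical and are not STAR*D data.

```python
import math
import statistics


def paired_effect(pre, post):
    """Return (cohens_d, t_statistic) for paired pre/post scores.

    Assumed definition: d = mean(change) / SD(change). The paired t
    statistic uses the same two quantities, so t = d * sqrt(n).
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    d = mean_diff / sd_diff
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return d, t


# Hypothetical Q-LES-Q-SF scores for five patients (illustration only).
pre = [40.0, 45.0, 50.0, 38.0, 47.0]
post = [52.0, 50.0, 61.0, 45.0, 60.0]
d, t = paired_effect(pre, post)
```

Because t grows with the square root of the sample size while d does not, a large sample can yield a very small P value even for a modest effect, which is why the two are reported side by side. Converting t to a P value requires the t distribution with n - 1 degrees of freedom, for example via `scipy.stats.ttest_rel`.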

Another important strength was the reliance on PROs. The instruments have already demonstrated
strong psychometric properties and provide a unique perspective on functional outcomes that are
difficult to obtain by clinician rating. PROs have been found to play an important role not only
in examining the impact of treatment interventions, but also in predicting relapse in MDD.











| Intervention | All: n | All: IBI-D Pre | All: IBI-D Post | All: IBI-D Change | All: P | Remitters: n | Remitters: IBI-D Pre |
|---|---|---|---|---|---|---|---|
| Bupropion | 121 | -0.02 (1.07) | -0.67 (1.44) | -0.66 (1.21) | <0.0001 | 26 | -0.58 (1.03) |
| Sertraline | 131 | 0.04 (0.95) | -0.79 (1.43) | -0.83 (1.15) | <0.0001 | 35 | -0.54 (0.93) |
| Venlafaxine | 121 | 0.08 (0.92) | -0.81 (1.35) | -0.89 (1.08) | <0.0001 | 34 | -0.32 (0.79) |
| Cognitive therapy | 32 | -0.12 (0.85) | -1.32 (1.41) | -1.20 (1.46) | <0.0001 | 12 | -0.13 (0.94) |
| Citalopram + Bupropion | 148 | -0.34 (0.94) | -1.14 (1.35) | -0.80 (1.09) | <0.0001 | 53 | -0.77 (0.77) |
| Citalopram + Buspirone | 149 | -0.30 (0.98) | -0.90 (1.42) | -0.59 (1.07) | <0.0001 | 38 | -0.76 (0.72) |
| Citalopram + Cognitive therapy | 47 | -0.14 (0.86) | -0.86 (1.09) | -0.72 (1.14) | <0.0001 | 10 | -0.32 (0.73) |
| All | 749 | | | | | | |

Table III. Comparisons for 749 patients with both pre- and post-treatment values using the Individual Burden of Illness for Depression (IBI-D).
Values are means (standard deviation [SD]). Paired t test (within intervention change: exit vs base). P=0.082 (nonsignificant [NS]) for
difference in base to exit change between interventions, confirmed by Welch's analysis of variance (ANOVA; P=0.143; NS). n, number

Conclusion

PROs showed that all seven treatment options led to 
statistically and clinically significant improvements, 
however a substantial proportion of patients still suf-

| Intervention | Remitters: IBI-D Post | Remitters: IBI-D Change | Remitters: P | Nonremitters: n | Nonremitters: IBI-D Pre | Nonremitters: IBI-D Post | Nonremitters: IBI-D Change | Nonremitters: P |
|---|---|---|---|---|---|---|---|---|
| Bupropion | -2.40 (0.78) | -1.82 (1.12) | <0.0001 | 95 | 0.14 (1.03) | -0.20 (1.19) | -0.34 (1.02) | <0.0001 |
| Sertraline | -2.50 (0.60) | -1.96 (0.82) | <0.0001 | 96 | 0.25 (0.87) | -0.16 (1.10) | -0.42 (0.97) | <0.0001 |
| Venlafaxine | -2.22 (0.63) | -1.90 (0.91) | <0.0001 | 87 | 0.24 (0.93) | -0.26 (1.13) | -0.49 (0.86) | <0.0001 |
| Cognitive therapy | -2.68 (0.44) | -2.55 (1.07) | <0.0001 | 20 | -0.11 (0.82) | -0.50 (1.12) | -0.39 (0.99) | <0.0001 |
| Citalopram + Bupropion | -2.40 (0.64) | -1.64 (0.88) | <0.0001 | 95 | -0.11 (0.95) | -0.44 (1.11) | -0.33 (0.91) | <0.0001 |
| Citalopram + Buspirone | -2.30 (0.66) | -1.54 (0.90) | <0.0001 | 111 | | -0.42 (1.29) | | <0.0001 |
| Citalopram + Cognitive therapy | -2.22 (0.49) | -1.90 (0.79) | <0.0001 | 37 | -0.10 (0.89) | -0.49 (0.90) | -0.40 (1.01) | <0.0001 |

Table III. Continued
Patient-reported outcomes before and after treatment of major depressive disorder

Patient-reported outcomes (PROs) of quality of life, functioning, and depressive symptom
severity are important for assessing the burden of illness and for measuring the impact of
treatment of major depressive disorder (MDD). This article seeks to provide a detailed analysis
of PROs before and after treatment of MDD from the large STAR*D (Sequenced Treatment
Alternatives to Relieve Depression) study. The analysis examines PROs before and after treatment
in the second level of STAR*D. The complete data on quality of life, functioning, and depressive
symptom severity were analyzed for each STAR*D level 2 treatment. PROs of quality of life,
functioning, and depressive symptom severity showed significant impairment after failing the
trial with citalopram, a selective serotonin reuptake inhibitor (level 1). The seven level 2
therapeutic options had a statistically (P values) and clinically (Cohen's standardized
differences [Cohen's d]) significant impact on quality of life, functioning, depressive symptom
severity, and reduction in the calculated burden of illness. There were no statistically
significant differences between the interventions. However, a significant proportion of patients
continued to have impaired quality of life and functioning after treatment, and the effect was
more pronounced in those who did not remit. PROs are key to understanding the impact of MDD and
to examining the effects of therapeutic interventions in both research and clinical practice.



Patient-reported outcomes before and after treatment of a major depressive episode

Patient-reported outcomes (PROs) of quality of life (QoL), functioning, and depressive symptom
severity are important in assessing the burden of a major depressive episode (MDE) and the
impact of treatment. We sought to analyze in detail PROs before and after treatment of an MDE
during the second level of the large STAR*D (Sequenced Treatment Alternatives to Relieve
Depression) study. The complete data on QoL, functioning, and depressive symptom severity are
analyzed for each STAR*D level 2 treatment. In level 1 of the study, after failure of
citalopram, a selective serotonin reuptake inhibitor, PROs of QoL, functioning, and depressive
symptom severity were very poor. The impact of the seven level 2 therapeutic options on QoL,
functioning, depressive symptom severity, and reduction in the calculated burden of illness
showed statistically (P values) and clinically (Cohen's standardized differences [Cohen's d])
positive differences. There are no statistically significant differences between the treatments.
Nevertheless, a substantial proportion of patients continue to suffer after treatment, according
to patient-reported QoL and functioning, and more markedly so among those not in remission. PROs
are essential for understanding the impact of MDE and for observing the effects of treatment, in
both research and clinical practice.